About our human full-length cDNA sequencing projects

 

The number of human genes has been estimated at 20,000 to 25,000. However, the number of distinct human mRNA species has been predicted to be about 100,000. This diversity is thought to arise from variation in transcription start sites (TSSs) and in splicing. In our previous human cDNA project, about 30,000 fully sequenced FLJ human full-length cDNAs were deposited in DDBJ/GenBank/EMBL, and we obtained about 1.46 million 5'-end sequences (5'-ESTs) of FLJ full-length cDNAs from about 100 kinds of cDNA libraries, which were constructed from human tissues and cells by the oligo-capping method. Our FLJ cDNAs covered about 80% of human genes. Against this background, we developed efficient systems for cloning and evaluating human splicing-variant cDNAs (Fig. 1-1) in our project. More than one thousand finished-grade, fully sequenced full-length cDNAs were obtained in this project. We then constructed sequence analysis databases focused on mRNA variation, using human genome and cDNA sequences, FLJ full-length sequenced cDNAs, 5'-ESTs of FLJ full-length cDNAs, and other public cDNA sequences.

 

Fig. 1-1 Overview of the cDNA evaluation system

 

1) cDNAs constructed by the oligo-capping method

A full-length cDNA is an important material for the experimental analysis of gene product function (Fig. 1-2). However, it has been extremely difficult to isolate full-length cDNAs efficiently from cDNA libraries constructed by conventional methods. Sugano et al. developed the "oligo-capping method" (1, 2), which replaces the cap structure specific to the 5' end of eukaryotic mRNA with a synthetic oligonucleotide. Because a specific sequence is ligated to the 5' end of these mRNAs, cDNA libraries constructed from them allow full-length cDNAs to be isolated efficiently. Moreover, we improved the oligo-capping method by optimizing all of its steps (3). The full-length rate of human cDNA libraries constructed using the improved oligo-capping method was 90%, and the majority of the cDNA inserts were over 2 kb.

(1) Maruyama, K. and Sugano, S. (1994) Gene 138: 171-174; (2) Suzuki, Y. et al. (1997) Gene 200: 149-156; (3) Ota, T. et al., WO 01/04286.

 

Fig. 1-2 Human full-length cDNA project

 

2) About our human cDNA project

(METI full-length human cDNA project in Japan from 1996 to 2006)

 

i) HRI full-length human cDNA project supported by METI through the Japan Key Technology Center from 1996 to 1998: PSEC cDNAs, 246 sequences

- Collaboration with Dr. S. Sugano, UT, for construction of cDNA libraries by the oligo-capping method

- Focused on secreted and membrane proteins (PSEC cDNAs)

 

ii) NEDO full-length human cDNA sequencing project (FLJ-PJ) using oligo-capped full-length cDNAs from 1998 to 2002, supported by METI: 30,063 sequences

- Collaboration with RAB, HRI, UT and NITE

- In this project, about 1.5 million 5'-ESTs were obtained from full-length human cDNAs in cDNA libraries constructed by the oligo-capping method: 5'-ESTs, 1,456,213 sequences

 

iii) Human cDNA sequencing project focused on splicing variants of mRNA (SV-PJ) in the NEDO functional analysis of protein and research application project (FAP-PJ) from 2002 to 2006, supported by METI: more than one thousand sequences

- Collaboration with JBIC (including REPRORI and Hitachi), AIST and NITE

 

iv) Construction of the FLJ human cDNA database from 2002 to 2007, mainly by REPRORI and Hitachi, and from 2007 to 2012 by UT

 

AIST: National Institute of Advanced Industrial Science and Technology, Japan

Hitachi: Hitachi, Ltd., Japan

HRI: Helix Research Institute supported by Japan Key Technology Center and 10 companies, Japan

JBIC: Japan Biological Informatics Consortium, Japan

METI: Ministry of Economy, Trade and Industry, Japan

NEDO: New Energy and Industrial Technology Development Organization, Japan

NITE: National Institute of Technology and Evaluation, Japan

RAB: Research Association for Biotechnology, Japan

REPRORI: Reverse Proteomics Research Institute supported by 11 companies, Japan

UT: The University of Tokyo, Japan


(Mar. 7, 2012)